Abstract

In image analysis, many tasks require representing two-dimensional (2D) shape, often specified by a set 
of 2D points, for comparison purposes. The challenge of the representation is that it must not only capture the 
characteristics of the shape but also be invariant to relevant transformations. Invariance to geometric transformations, 
such as translation, rotation, and scale, has received attention in the past, usually under the assumption that the 
points are previously labeled, i.e., that the shape is characterized by an ordered set of landmarks. However, in 
many practical scenarios, the points describing the shape are obtained from automatic processes, e.g., edge or 
corner detection, thus without labels or natural ordering. Obviously, the combinatorial problem of computing the 
correspondences between the points of two shapes in the presence of the aforementioned geometrical distortions 
becomes a quagmire when the number of points is large. We circumvent this problem by representing shapes in a 
way that is invariant to the permutation of the landmarks, i.e., we represent bags of unlabeled 2D points. Within 
our framework, a shape is mapped to an analytic function on the complex plane, leading to what we call its analytic 
signature (ANSIG). To store an ANSIG, it suffices to sample it along a closed contour in the complex plane. We 
show that the ANSIG is a maximal invariant with respect to the permutation group, i.e., that different shapes have 
different ANSIGs and shapes that differ by a permutation (or re-labeling) of the landmarks have the same ANSIG. 
We further show how easy it is to factor out geometric transformations when comparing shapes using the ANSIG 
representation. Finally, we illustrate these capabilities with shape-based image classification experiments. 

Index Terms 

Sets of unlabeled points, permutation invariance, 2D shape representation, shape theory, shape recognition, 
shape-based classification, analytic function, analytic signature (ANSIG). 

I. Introduction 

This paper deals with the representation of two-dimensional (2D) shape. In our context, a 2D shape 
is described by the 2D coordinates of a set of unlabeled points, or landmarks. We seek efficient ways 
to represent such sets; in particular, we seek representations that are suitable for use in shape-based
recognition tasks. Besides being discriminative, those representations should be invariant to (or, at least, 
deal well with) shape-preserving geometric transformations. Above all, such representations must deal with 
the fact that the landmarks do not have labels. This means that the representation should be invariant to the 
order by which the landmarks are stored, since different orderings of the same set of landmarks represent 
the same shape. We thus focus on developing permutation invariant, or label invariant, representations 
for 2D shape. 



A. Shape representation 

Many objects are primarily recognized by their shape, rather than their color or texture [1], [2]. Although
this fact has been confirmed by surveys showing that users would prefer to retrieve images from shape 
queries [3], the majority of content-based image retrieval systems still use color and texture features to 
compare images. In fact, shape-based classification proved to be a very hard task, remaining an open
problem, underlying which is the fundamental question of how to represent shape. 

When the shape is described by a set of labeled landmarks, an established theory provides tools to cope 
with geometric transformations and shape variations: the statistical theory of shape [4], [5]. Although this
theory has led to significant results when the shapes to compare are characterized by feature points whose
correspondences from image to image can be obtained, in many practical scenarios, such correspondences 
are not available. We thus focus on unlabeled data. 

The majority of the methods to cope with shapes described by unlabeled sets of points focus on 
representing a "blob", i.e., a shape that is a simply-connected set of points. A number of techniques, 
usually called region-based, describe these shapes by using moment descriptors, e.g., geometrical [6],
Legendre [7], Zernike [8], [7], and Tchebichef [9] moments. Other approaches, contour-based, represent
the boundary of the shape using, e.g., curvature scale space [10], wavelets [11], contour displacements [12],
splines [13], or Fourier descriptors [14], [15], [16]. Some of these representations exhibit desired invariance
to geometric transformations but they are restricted to shapes well described by closed contours. 

In image analysis, shape cues come primarily from the image edges. In general, it is hard to extract 
complete contours when dealing with real images [17]; thus researchers also developed local shape
descriptors that, at each point of the shape, capture the relative distribution of the remaining points,
e.g., shape contexts [18] and distance multisets [19]. These local representations proved to cope with the
contour discontinuities typical of the output of automatic edge detection processes but do not deal with 
general shapes and geometric transformations. In this paper we seek ways to describe shapes characterized 
by arbitrary sets of points. 

A number of approaches to deal with shapes described by general sets of unlabeled points are motivated 
by the need to register the corresponding images, i.e., to compute the rigid transformation that best 
"aligns" them. The majority of these registration methods are inspired by the fact that the solution for 
the rigid transformation is easily computed when the labels, i.e., the point correspondences, are known 
— the Procrustes matching problem. To cope with unlabeled points, they develop iterative algorithms 
that compute, in alternate steps, the rigid registration parameters and the point correspondences. One of 
the better known examples is the Iterative Closest Point (ICP) algorithm [20]. More recently, others have
proposed statistical methods that use "soft" correspondences [21], [22], [23], [24], [25], [26], leading to
two-step iterative algorithms based on Expectation-Maximization (EM) [27]. Although these methods have
dealt with challenging scenarios, including shape part decomposition (241 and nonrigid registration (23, 
they have the limitations of iterative algorithms, including the uncertain convergence and the sensitivity 
to initialization. 

Rather than attempting to infer the labels of the points describing the shapes to compare, i.e., rather than 
computing the permutation between the two sets of points, we seek permutation invariant representations. 
The relevance of permutation invariance in learning tasks has been recently pointed out [28], [29], [30],
[31], [32]. In these works, the permutation is factored out after being computed as the solution of a
convex optimization problem over the permutation matrices. However, this formulation does not deal with 
geometric transformations such as the rigid rotation of the point set. A permutation invariant representation 
recently proposed describes a shape by the set of all pairwise distances between the shape points [33], [34],
[35]. Naturally, the dimensionality of this representation limits its applicability to shapes described
by small sets of landmarks, like fingerprint minutiae. We focus on large point sets, as those typically 
obtained from the edges of real images. 

B. Our approach: the analytic signature of a shape 

Our approach is rooted in a new permutation invariant, or label invariant, representation for sets of
2D points. We represent a 2D shape by what we call its analytic signature (ANSIG), an analytic function 
defined over the complex plane. We show that shapes that differ by a re-labeling of the landmarks 
(i.e., by a re-ordering of the vector containing the point set) have the same ANSIG. Thus, the ANSIG 
representation is permutation invariant, or label invariant. Furthermore, we show that this representation
enables discriminating different shapes, i.e., geometrically different shapes are in fact represented by 
different ANSIGs. 

The impact of the ANSIG representation in shape-based classification tasks is obvious: while methods 
that use less powerful (i.e., "less invariant") representations usually require several prototypes of each class 
(i.e., several training examples to perform statistical learning) and sophisticated classification schemes, in 
our case, shape-based classification boils down to comparing the ANSIG of a candidate shape with the 
one of a single prototype shape per class. As any analytic function, the ANSIG of a shape is completely 
described by the values it takes on a closed contour in the complex plane. Thus, we store ANSIGs by
simply sampling them on the unit-circle. To compare ANSIGs, i.e., to evaluate shape similarity, it suffices 
to compute the angle between the vectors collecting those samples. 

Although the ANSIG representation is not invariant with respect to geometric transformations, such as 
translation, rotation, and scale, they are easily taken care of. While an adequate pre-processing step factors 
out translation and scale, the rotation that best aligns two shapes is easily obtained from their ANSIGs. 
In fact, the rotation that maximizes the above mentioned similarity is efficiently computed by using the 
Fast Fourier Transform (FFT) algorithm. 

In many practical scenarios the set of points describing a shape is obtained from an image, as the 
output of an automatic process, e.g., edge detection or simple thresholding. Thus, it is natural to expect 
that, besides being noisy, the sets of 2D points to compare have distinct cardinality (particularly when 
comparing shapes obtained from images of different sizes). We show how the ANSIG representation also 
deals well with this kind of perturbation.

To illustrate the invariance properties of the ANSIG representation, we present experiments with 
synthetic data. To demonstrate its usefulness in shape-based classification, we report results obtained 
with real images. Other experiments are in preliminary versions of this work [36], [37].

C. Paper organization 

In Section II we address the construction of permutation invariant representations for a set of 2D points.
We show that what we call the analytic signature (ANSIG) of a 2D shape is such a representation and
discuss how to store it. Section III describes how shape-preserving transformations are easily taken care
of by the ANSIG representation. In Section IV we discuss implementation issues that arise from the
need to efficiently compare ANSIGs, for the purpose of shape-based classification. Section V contains
experiments and Section VI concludes the paper.

II. Permutation invariance: the analytic signature of a shape 

We consider that a 2D shape is described by a set of n unlabeled points, or landmarks, in the plane. 
Thus, under the usual identification of $\mathbb{R}^2$ with the complex plane $\mathbb{C}$, a 2D shape can be represented by
a vector

$$z = [\, z_1 \;\; z_2 \;\; \cdots \;\; z_n \,]^T \in \mathbb{C}^n. \tag{1}$$

However, since there are no labels for the landmarks, the order in which they are stored is irrelevant and
the choice in (1) is not unique, i.e., the same shape is equivalently represented by any vector in the set

$$\{\Pi z : \Pi \in \Pi(n)\}, \tag{2}$$

where $\Pi(n)$ denotes the set of $n \times n$ permutation matrices¹.

Our goal is to develop efficient ways to represent the vector set in (2).

¹ A permutation matrix is a square matrix with exactly one entry equal to 1 per row and per column and the remaining entries equal to 0.
Each element in $\Pi(n)$ represents a specific permutation: when multiplied by a vector $z$, it produces a vector $\Pi z$ that has the same entries as
$z$ but arranged in a possibly different order. The cardinality of $\Pi(n)$ is $n!$.

A. Shapes as points in a quotient space

In group theory parlance, we say that a shape, defined as a set like the one in (2), is a point in the quotient space
$\mathbb{C}^n/\Pi(n)$. Here, we view $\Pi(n)$ as a group, whose action on $\mathbb{C}^n$ is the left matrix multiplication, i.e., the
group operation $\Pi(n) \times \mathbb{C}^n \to \mathbb{C}^n$ is simply $\Pi \cdot z = \Pi z$. This group action induces a partition of $\mathbb{C}^n$ into
disjoint orbits: the orbit passing through $z$ collects all possible ways of representing the shape in vector
$z$, i.e., it is the set in (2). Each shape corresponds then to an orbit and we denote the one corresponding
to the shape in vector $z$, i.e., the orbit passing through $z$, by $[z]$. The quotient space is then the set of
shapes, i.e., the set of all orbits:

$$\mathbb{C}^n/\Pi(n) = \{[z] : z \in \mathbb{C}^n\}. \tag{3}$$




Fig. 1. Shapes as points in a quotient space. In the shape drawings, the line connecting the landmarks indicates their labeling, i.e., the 
order by which the points are stored. While shape instances that only differ by a re-labeling of the landmarks are mapped to the same point 
in the quotient space, geometrically different shapes are mapped to distinct points. 

Fig. 1 illustrates the scenario, where the canonical (or quotient) map $\pi : \mathbb{C}^n \to \mathbb{C}^n/\Pi(n)$ maps each
shape vector instance to its orbit, i.e.,

$$\pi(z) = [z] = \{\Pi z : \Pi \in \Pi(n)\}. \tag{4}$$

From the definitions above, it is clear that two vectors $z$ and $w$ that represent the same shape, i.e., that
correspond to distinct labelings of the same landmarks, are mapped by $\pi$ to the same point in the quotient
space: $\pi(z) = \pi(w)$. Also, two vectors $z$ and $w$ that do not represent the same shape, i.e., whose entries
are not simply related by a permutation, will be mapped by $\pi$ to distinct points in the quotient space:
$\pi(z) \neq \pi(w)$, see Fig. 1. This characteristic is usually referred to as a maximal invariance property,
meaning that

$$\pi(z) = \pi(w) \;\Leftrightarrow\; z \text{ and } w \text{ represent the same shape.} \tag{5}$$

Although the maximal invariance property in (5) is precisely what we look for (the map $\pi$ detects
whether or not $z$ and $w$ represent the same shape), this mechanism is hardly implementable, due to the
rather abstract nature of the quotient space and map.

B. Polynomial signature of a shape 

We now develop a version of the objects introduced above that is suitable for use in practice. In 
particular, we propose to replace the abstract map $\pi : \mathbb{C}^n \to \mathbb{C}^n/\Pi(n)$ by a map from $\mathbb{C}^n$ to the set $\mathcal{A}$
of analytic functions on the complex plane:

$$\mathcal{A} = \{f : \mathbb{C} \to \mathbb{C} : f \text{ is analytic}\}. \tag{6}$$

Our surrogate map must exhibit maximal invariance with respect to the group $\Pi(n)$, just like the quotient
map $\pi$ in (5). This means that it must map $z$ and $w$ to the same analytic function if and only if $z$ and
$w$ represent the same shape, i.e., iff $[z] = [w]$, i.e., iff $z = \Pi w$, for some permutation matrix $\Pi$.

A possible choice for the map from $\mathbb{C}^n$ to $\mathcal{A}$ is such that a shape vector $z$ is mapped to a complex
polynomial $p(z, \cdot)$, leading to what we call the polynomial signature of a shape. This map $z \mapsto p(z, \cdot)$ is
defined by the following expression, where $\xi$ is a dummy complex variable:

$$p(z, \xi) = \prod_{m=1}^{n} (\xi - z_m). \tag{7}$$

Maximal invariance of the polynomial signature. It is clear that the invariance with respect to the
permutation group holds. In fact, $p(z, \cdot) = p(w, \cdot)$ whenever $z$ and $w$ differ only by a permutation of their
entries, i.e., whenever $z = \Pi w$, because, from definition (7), it follows immediately that $p(\Pi w, \cdot) =
p(w, \cdot)$, due to the commutativity and associativity of the complex product.

To establish the maximal invariance of the polynomial signature, consider two vectors $z$ and $w$ that
have equal signatures, i.e., such that the polynomials $p(z, \cdot)$ and $p(w, \cdot)$ are equal. This equality means
that those polynomials share the same system of roots, including their multiplicities. Since, from (7),
the roots of a polynomial signature are the complex points in the shape vector, the equality of $p(z, \cdot)$
and $p(w, \cdot)$ implies that the (multi-)sets $\{z_1, z_2, \ldots, z_n\}$ and $\{w_1, w_2, \ldots, w_n\}$ are equal. This way we
conclude that the vectors $z = [\, z_1 \; z_2 \; \cdots \; z_n \,]^T$ and $w = [\, w_1 \; w_2 \; \cdots \; w_n \,]^T$ are equal up to a permutation,
thus proving the maximal invariance of the polynomial signature in (7).

The practical use of the polynomial signature is limited by its numerical instability. In fact, since a shape
with $n$ points is represented by an $n$th-degree polynomial, when $n$ is large the signature exhibits extremely
large variations, which makes it numerically unstable (e.g., sensitive to noise).
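
As an illustration of both the permutation invariance and the large variations mentioned above, the following minimal Python sketch evaluates the polynomial signature (7) directly from its product form. The function name `polynomial_signature`, the random test shape, and the choice of evaluation circles are assumptions made only for this example; they are not taken from the experiments reported in this paper.

```python
import numpy as np

def polynomial_signature(z, xi):
    """Evaluate p(z, xi) = prod_m (xi - z_m) at the complex point(s) xi."""
    z = np.asarray(z, dtype=complex)
    xi = np.atleast_1d(np.asarray(xi, dtype=complex))
    # Rows index evaluation points, columns index landmarks.
    return np.prod(xi[:, None] - z[None, :], axis=1)

rng = np.random.default_rng(0)
n = 200
# Hypothetical shape: n landmarks drawn at random in the unit disk.
z = np.sqrt(rng.uniform(size=n)) * np.exp(2j * np.pi * rng.uniform(size=n))

# Permutation invariance: re-ordering the landmarks leaves the signature unchanged
# (up to floating-point rounding in the product).
xi = np.exp(2j * np.pi * np.arange(8) / 8)
p_original = polynomial_signature(z, xi)
p_permuted = polynomial_signature(rng.permutation(z), xi)
print(np.allclose(p_original, p_permuted, rtol=1e-9, atol=0.0))

# Dynamic range: magnitudes of the signature on circles of radius 0.5 and 2.
for radius in (0.5, 2.0):
    magnitudes = np.abs(polynomial_signature(z, radius * xi))
    print(radius, magnitudes.min(), magnitudes.max())
```

Printing the magnitudes for the two radii gives a feeling for how strongly the values of an $n$th-degree polynomial vary with the evaluation point when $n$ is in the hundreds.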



C. The analytic signature (ANSIG) of a shape 

Inspired by the polynomial signature, we now develop another analytic map $a : \mathbb{C}^n \to \mathcal{A}$ that, besides
maintaining the maximal invariance property, is numerically stable and robust, leading to what we
call the analytic signature (ANSIG) of a shape. This map $z \mapsto a(z, \cdot)$, i.e., the ANSIG of the shape
described by vector $z$, is defined by

$$a(z, \xi) = \frac{1}{n} \sum_{m=1}^{n} e^{z_m \xi}. \tag{8}$$

Maximal invariance of the ANSIG. Just like for the polynomial signature, the invariance of the ANSIG
with respect to the permutation group is obvious. In fact, from (8), $a(z, \cdot) = a(w, \cdot)$ whenever $z = \Pi w$,
due to the commutativity and associativity of the complex sum.
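
The invariance just stated can be checked numerically. The sketch below is a minimal Python rendering of definition (8); the function name `ansig`, the random landmarks, and the random evaluation points are hypothetical choices made for illustration only.

```python
import numpy as np

def ansig(z, xi):
    """Evaluate a(z, xi) = (1/n) * sum_m exp(z_m * xi) at the complex point(s) xi."""
    z = np.asarray(z, dtype=complex)
    xi = np.atleast_1d(np.asarray(xi, dtype=complex))
    # Rows index evaluation points, columns index landmarks; the mean implements (1/n) * sum.
    return np.exp(xi[:, None] * z[None, :]).mean(axis=1)

rng = np.random.default_rng(1)
z = rng.normal(size=30) + 1j * rng.normal(size=30)   # hypothetical landmarks
xi = rng.normal(size=5) + 1j * rng.normal(size=5)    # arbitrary evaluation points

# Re-ordering the landmarks does not change the signature ...
same_after_permutation = np.allclose(ansig(z, xi), ansig(rng.permutation(z), xi))
# ... whereas moving the landmarks (here, a small translation) does.
same_after_translation = np.allclose(ansig(z, xi), ansig(z + 0.1, xi))
print(same_after_permutation, same_after_translation)
```

The second check is a reminder that the map $a$ by itself is invariant only to permutations; geometric transformations have to be handled separately.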

To establish the maximal invariance, consider two shape vectors $z$ and $w$ that have equal ANSIGs,
$a(z, \cdot) = a(w, \cdot)$. We will show that $z$ and $w$ also have the same polynomial signature, $p(z, \cdot) = p(w, \cdot)$,
i.e., that they represent the same shape, differing only by a permutation of their entries. Consider then the
equal ANSIGs $a(z, \cdot)$ and $a(w, \cdot)$, given by (8), as analytic functions on the complex plane. Obviously,
their derivatives at the origin also coincide:

$$\left.\frac{d^k}{d\xi^k}\, a(z, \xi)\right|_{\xi=0} = \left.\frac{d^k}{d\xi^k}\, a(w, \xi)\right|_{\xi=0}, \quad k = 1, 2, \ldots, n. \tag{9}$$

Since the $k$th derivative of $e^{z_m \xi}$ at $\xi = 0$ is $z_m^k$, the definition of the ANSIG map $a$ in (8) lets us write
the system of equations (9) in terms of the entries of the vectors $z$ and $w$:

$$z_1^k + z_2^k + \cdots + z_n^k = w_1^k + w_2^k + \cdots + w_n^k, \quad k = 1, 2, \ldots, n. \tag{10}$$

We now show that the set of equalities (10) implies that the vectors $z$ and $w$ have the same polynomial
signature, thus represent the same shape. Start by noting that, since the $k$th moment of an $n$th-order
polynomial with roots $\{r_1, r_2, \ldots, r_n\}$ is given by

$$\mu_k = r_1^k + r_2^k + \cdots + r_n^k, \tag{11}$$

the system of equations in (10) expresses the equality of the first $n$ moments of the polynomial signatures
$p(z, \cdot)$ and $p(w, \cdot)$, see their definition in (7). We now use the so-called Newton's identities, which relate
the first $n$ moments $\mu_1, \mu_2, \ldots, \mu_n$ of a polynomial $a_0 + a_1 \xi + a_2 \xi^2 + \cdots + a_n \xi^n$ with its coefficients
$\{a_0, a_1, \ldots, a_n\}$, see, e.g., [38]:

$$a_n \mu_k + a_{n-1} \mu_{k-1} + \cdots + a_{n-k+1} \mu_1 = -k\, a_{n-k}, \quad k = 1, 2, \ldots, n. \tag{12}$$

For monic polynomials, i.e., when $a_n = 1$, as is the case of the polynomial signatures by construction,
see (7), Newton's identities (12) are written in matrix format as:
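
$$\begin{bmatrix} 1 & & & & \\ \mu_1 & 2 & & & \\ \mu_2 & \mu_1 & 3 & & \\ \vdots & \vdots & \ddots & \ddots & \\ \mu_{n-1} & \mu_{n-2} & \cdots & \mu_1 & n \end{bmatrix} \begin{bmatrix} a_{n-1} \\ a_{n-2} \\ a_{n-3} \\ \vdots \\ a_0 \end{bmatrix} = - \begin{bmatrix} \mu_1 \\ \mu_2 \\ \mu_3 \\ \vdots \\ \mu_n \end{bmatrix}. \tag{13}$$

From the structure of the matrix equality above, it is clear that the moments of a polynomial uniquely
determine its coefficients, i.e., they fully specify the polynomial. In fact, it is immediate that $\mu_1$ determines
$a_{n-1}$, $\mu_1$ and $\mu_2$ determine $a_{n-1}$ and $a_{n-2}$, etc.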

The moment equalities in (10) imply then that the polynomial signatures $p(z, \cdot)$ and $p(w, \cdot)$ defined in
(7) are identical. Thus, using the already established maximal invariance of the polynomial signature, we
conclude that $z = [\, z_1 \; z_2 \; \cdots \; z_n \,]^T$ and $w = [\, w_1 \; w_2 \; \cdots \; w_n \,]^T$ are equal up to a permutation, proving the
maximal invariance of our ANSIG map $a$ in (8).
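
The key step of the argument, namely that the first $n$ moments determine the coefficients of a monic polynomial, can be made concrete with a short Python sketch. It solves Newton's identities (12) recursively for $a_{n-1}, a_{n-2}, \ldots, a_0$ and compares the result with the coefficients that NumPy's `np.poly` builds from the roots. The function name `coefficients_from_moments` and the four test landmarks are assumptions introduced for this example.

```python
import numpy as np

def coefficients_from_moments(mu):
    """Recover a_0, ..., a_n of a monic polynomial from its moments.

    mu[k - 1] holds the k-th moment mu_k = sum_i r_i**k, k = 1, ..., n.
    The returned array a satisfies p(xi) = sum_j a[j] * xi**j with a[n] = 1.
    """
    n = len(mu)
    a = np.zeros(n + 1, dtype=complex)
    a[n] = 1.0
    for k in range(1, n + 1):
        # Newton's identity: a_n mu_k + a_{n-1} mu_{k-1} + ... + a_{n-k+1} mu_1 = -k a_{n-k}
        partial = sum(a[n - i] * mu[k - 1 - i] for i in range(k))
        a[n - k] = -partial / k
    return a

# Hypothetical shape with four landmarks.
z = np.array([1 + 2j, -0.5j, 0.3, -1.0])
moments = [np.sum(z ** k) for k in range(1, len(z) + 1)]
a = coefficients_from_moments(moments)

# np.poly returns the monic coefficients from the highest degree down, hence the reversal.
print(np.allclose(a[::-1], np.poly(z)))
```

The recursion is a triangular solve: each new coefficient depends only on the moments and on the coefficients already computed, which is why equal moments force equal polynomial signatures.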



D. Storing the ANSIG 

The ANSIG $a : \mathbb{C}^n \to \mathcal{A}$ maps a shape, i.e., a vector $z$, to an analytic function $a(z, \cdot)$ on the
complex plane. Since the space of analytic functions is infinite-dimensional, our map seems inadequate
for a computer implementation. However, we can store any $f \in \mathcal{A}$ by exploiting the well-known Cauchy's
integral formula, whose major consequence is that any analytic function is unambiguously determined by
the values it takes on a simple closed contour, see, e.g., [39]. In particular, if we choose the contour to
be the unit-circle, i.e., the unit-radius circle centered at the origin, $S^1 = \{z \in \mathbb{C} : |z| = 1\}$, the analytic
function $f$ is uniquely determined by $\{f(e^{j\varphi}) : \varphi \in [0, 2\pi]\}$. In practice, this is approximated by sampling
$f$ on a finite set of $K$ points uniformly distributed on the unit-circle², i.e., on $\{1, W_K, W_K^2, \ldots, W_K^{K-1}\}$,
where

$$W_K = e^{j \frac{2\pi}{K}}. \tag{14}$$

In summary, we approximate the ANSIG map $a : \mathbb{C}^n \to \mathcal{A}$ by its discrete counterpart $a_K : \mathbb{C}^n \to \mathbb{C}^K$,
where the discrete version of the ANSIG of $z$ is then given by

$$a_K(z) = [\, a(z, 1) \;\; a(z, W_K) \;\; a(z, W_K^2) \;\; \cdots \;\; a(z, W_K^{K-1}) \,]^T. \tag{15}$$

² In all the experiments reported in this paper, we used $K = 512$.
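
A minimal Python sketch of the discrete map in (15) is given below. The function name `discrete_ansig` and the random test shape are assumptions made for illustration; the default $K = 512$ follows the value used in our experiments. The sketch builds the full $K \times n$ matrix of exponents at once, which is convenient for small examples but may need to be processed in blocks for very large point sets.

```python
import numpy as np

def discrete_ansig(z, K=512):
    """Return a_K(z): the ANSIG of z sampled at 1, W_K, ..., W_K^(K-1)."""
    z = np.asarray(z, dtype=complex)
    samples = np.exp(2j * np.pi * np.arange(K) / K)   # powers of W_K = exp(2j*pi/K)
    # Entry (i, m) of the outer product is W_K^i * z_m; averaging over m implements (8).
    return np.exp(np.outer(samples, z)).mean(axis=1)

rng = np.random.default_rng(4)
z = rng.normal(size=100) + 1j * rng.normal(size=100)   # hypothetical landmarks

a_K = discrete_ansig(z)
a_K_permuted = discrete_ansig(rng.permutation(z))
print(a_K.shape, np.allclose(a_K, a_K_permuted))
```

Whatever the number of landmarks, the stored signature has the fixed length $K$, which is what makes signatures of differently sampled shapes directly comparable.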

E. Illustration

To illustrate the invariance of the ANSIG to the landmark labels, we use two shapes whose only 
difference is the order by which the landmarks are stored. In Fig. 2 we represent such a pair of shapes.
Note that the set of landmarks (black dots) of the left image is the same as that of the right one. To
indicate the order by which the landmarks are stored, we use a line connecting them, thus the images 
look different, illustrating the quagmire of having to compare shapes when the correspondence between 
the sets of landmarks is unknown. 

[Figure 2 panels: "Shape" (left) and "Re-ordered shape" (right); both horizontal axes span -1.5 to 1.5.]

Fig. 2. Two observations of the same shape, described by the positions of a set of landmarks, here represented by the black dots. The line 
connecting the dots, represented only for illustrative purposes, indicates the order by which the points are stored. Thus, although the left and 
right images represent the same shape, this is not trivially inferred from the stored data. 

We compute the ANSIG of each of the shapes in Fig. 2. As expected, although the vectors containing the
landmarks for these two shapes are completely different (due to re-ordering), we obtain the same ANSIG,
which is represented in Fig. 3. This illustrates that the ANSIG is a permutation invariant representation.
Obviously, the discrete counterparts of the ANSIGs, i.e., their samples on the unit-circle, also represented
in Fig. 3, are also equal. Since the ANSIGs of geometrically different shapes are distinct, this signature
is a substitute for the abstract quotient map $\pi$ in Fig. 1. The reader may also wonder why the middle
and right plots in Fig. 3 are periodic (period $\pi$). This fact is due to the rotational symmetry (of $\pi$) of
the shapes represented in Fig. 2. We postpone to the following section a more detailed comment on this
aspect.



[Figure 3 panel title: "ANSIG".]




Fig. 3. The ANSIG of the shapes in Fig. 2. On the left, the real part is represented by the surface height and the imaginary part by its color.
In the middle and right plots, the discrete version of the ANSIG, i.e., its unit-circle samples. Although the vectors containing the shapes in
Fig. 2 are distinct, due to re-ordering, their ANSIGs are equal, as desired.

III. Quotienting out shape-preserving geometric transformations

We now address how the ANSIG representation handles shape-preserving geometric transformations, 
such as translation, rotation and scale. For reasons that will become clear below, we require that the shape 
vector $z = [\, z_1 \; z_2 \; \cdots \; z_n \,]^T$ does not have all its entries with the same value, i.e., we require that $z \in \mathbb{C}^n_*$,
where $\mathbb{C}^n_* = \mathbb{C}^n - \{[\, z \; z \; \cdots \; z \,]^T : z \in \mathbb{C}\}$ (note that only the pathological cases when all the landmarks
collapse into a single point are excluded).

As introduced in the previous section, a shape is an orbit generated by a group G of plausible 
transformations. Our main goal is to conceive a computationally efficient mechanism to detect if two 
given vectors $z, w \in \mathbb{C}^n_*$ are in the same orbit, i.e., if they represent the same shape. In the previous
section, we considered the permutation group $G = \Pi(n)$, i.e., we considered that $z$ and $w$ represent the
same shape if they are equal up to a permutation. We introduced a maximal invariant with respect to
this group: the ANSIG $a : \mathbb{C}^n_* \to \mathcal{A}$ is such that $a(z, \cdot) = a(w, \cdot)$ if and only if $z$ and $w$ are in the
same orbit. Building on this result, we now extend the group G to also accommodate shape-preserving 
geometric transformations. 



A. Group of geometric transformations 

Naturally, two vectors $z, w \in \mathbb{C}^n_*$ represent the same shape if they are equal, up to, not only a permutation
of their entries, but also a translation, rotation, and scale factor, affecting the set of points they represent
in the plane. To take this into account, we consider the group of transformations $G = \Pi(n) \times \mathbb{C} \times S^1 \times \mathbb{R}^+$,
where $\Pi(n)$ is the set of permutation matrices, $S^1$ is the unit-circle in the complex plane, and $\mathbb{R}^+$ is the set
of positive real numbers. We then consider the action of $G$ on $\mathbb{C}^n_*$ as the map $G \times \mathbb{C}^n_* \to \mathbb{C}^n_*$, defined by

$$(\Pi, t, e^{j\theta}, \lambda) \cdot z = \lambda e^{j\theta} \Pi z + t \mathbf{1}_n, \tag{16}$$

where $\mathbf{1}_n = [\, 1 \; 1 \; \cdots \; 1 \,]^T$ is the $n$-dimensional vector with all entries equal to 1. It is clear from (16)
that the action of an element of the group $G$ on a given vector $z$ corresponds to a permutation ($\Pi$),
translation ($t$), rotation ($\theta$), and scaling ($\lambda$), applied to the shape in $z$. Naturally, the group operation takes
into account the composition of rigid geometric transformations:

$$\begin{aligned}
(\Pi_1, t_1, e^{j\theta_1}, \lambda_1) \cdot (\Pi_2, t_2, e^{j\theta_2}, \lambda_2) \cdot z &= (\Pi_1, t_1, e^{j\theta_1}, \lambda_1) \cdot (\lambda_2 e^{j\theta_2} \Pi_2 z + t_2 \mathbf{1}_n) \\
&= \lambda_1 e^{j\theta_1} \Pi_1 (\lambda_2 e^{j\theta_2} \Pi_2 z + t_2 \mathbf{1}_n) + t_1 \mathbf{1}_n \\
&= \lambda_1 \lambda_2 e^{j(\theta_1 + \theta_2)} \Pi_1 \Pi_2 z + \lambda_1 e^{j\theta_1} t_2 \mathbf{1}_n + t_1 \mathbf{1}_n \\
&= (\Pi_1 \Pi_2,\; \lambda_1 e^{j\theta_1} t_2 + t_1,\; e^{j(\theta_1 + \theta_2)},\; \lambda_1 \lambda_2) \cdot z.
\end{aligned} \tag{17}$$
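
The composition rule in (17) is easy to verify numerically. In the Python sketch below, a group element is stored as a tuple `(perm, t, theta, lam)`, where the permutation matrix is represented by an index array `perm` such that the permuted vector is `z[perm]`. This representation, and the names `act` and `compose`, are assumptions made for this example only.

```python
import numpy as np

def act(g, z):
    """Apply g = (perm, t, theta, lam) to z, as in (16): lam * e^{j theta} * (Pi z) + t * 1_n."""
    perm, t, theta, lam = g
    return lam * np.exp(1j * theta) * z[perm] + t

def compose(g1, g2):
    """Return the element g1 g2 of (17), i.e., the one satisfying (g1 g2) . z = g1 . (g2 . z)."""
    perm1, t1, theta1, lam1 = g1
    perm2, t2, theta2, lam2 = g2
    # Applying perm2 and then perm1 picks the entries z[perm2[perm1]].
    return (perm2[perm1],
            lam1 * np.exp(1j * theta1) * t2 + t1,
            theta1 + theta2,
            lam1 * lam2)

rng = np.random.default_rng(5)
n = 12
z = rng.normal(size=n) + 1j * rng.normal(size=n)   # hypothetical landmarks
g1 = (rng.permutation(n), 0.5 - 2.0j, 0.9, 1.7)
g2 = (rng.permutation(n), -1.0 + 0.3j, 2.4, 0.6)

composition_holds = np.allclose(act(g1, act(g2, z)), act(compose(g1, g2), z))
print(composition_holds)
```

Note that the translation of the second element is itself rotated and scaled by the first one, which is the only non-obvious term in (17).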

The shapes that are equal to the instance $z$, up to a permutation, translation, rotation, and scale, form
the orbit $[z]$, the set obtained by considering the action of all elements of $G$ on $z$,

$$[z] = \{ (\Pi, t, e^{j\theta}, \lambda) \cdot z : (\Pi, t, e^{j\theta}, \lambda) \in G \}. \tag{18}$$

The problem of deciding if two given vectors $z, w$ represent the same shape, i.e., if they are in the same
orbit, corresponds to checking if the value of the optimization problem

$$\min_{(\Pi, t, e^{j\theta}, \lambda) \in G} \left\| z - (\Pi, t, e^{j\theta}, \lambda) \cdot w \right\| \tag{19}$$

is zero (or, in practice, below a small threshold, due to the noise). To the best of our knowledge, solving (19)
without making use of permutation invariant representations requires an exhaustive search over the set
$\Pi(n)$, which has cardinality $n!$. Clearly, this is not feasible, even for moderate values of the number of
landmarks, say $n = 100$ ($100! \approx 10^{158}$). In contrast, in our experiments, we use shapes described by
very large sets of points, e.g., with $n$ up to 40000.
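
The order of magnitude quoted above can be reproduced with a few lines of Python. The helper name `log10_factorial` is an assumption for this example; it uses the standard-library log-gamma function so that the size of the search space can also be estimated for values of $n$ whose factorial would be impractical to write out.

```python
import math

def log10_factorial(n):
    """Return log10(n!) using the log-gamma function, since n! = Gamma(n + 1)."""
    return math.lgamma(n + 1) / math.log(10)

# Number of decimal digits of 100!, computed exactly with integer arithmetic.
digits_of_100_factorial = len(str(math.factorial(100)))
print(digits_of_100_factorial)

# Base-10 logarithm of the number of permutations for a few point-set sizes.
for n in (100, 1000, 40000):
    print(n, log10_factorial(n))
```

The exact value of 100! has 158 decimal digits, in agreement with the approximation $10^{158}$ used in the text.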

B. Translation and scale

We use the ANSIG map to devise a scheme that circumvents the combinatorial search in (19). We start
by quotienting out all the transformations, except the rotation, through a map $\phi : \mathbb{C}^n_* \to \mathcal{A}$. Then, we
show that $\phi$ is equivariant with respect to the rotations, i.e., that its action commutes with the one of the
group of rotations. This will enable a computationally simple scheme to decide from $\phi(z)$ and $\phi(w)$ if
the vectors $z$ and $w$ correspond to the same shape.

To factor out translation and scale, we pre-process the shape vector, through centering and normalization,
before computing its signature. This corresponds to considering the composite map $\phi : \mathbb{C}^n_* \to \mathcal{A}$,
defined by

$$\phi(z, \xi) = a\!\left( \sqrt{n}\, \frac{z - \bar{z}}{\|z - \bar{z}\|},\, \xi \right), \tag{20}$$

where $a$ is given by (8), $\bar{z}$ is a vector with all entries equal to the mean value of $z$, i.e., $\bar{z} = \frac{1}{n} \mathbf{1}_n \mathbf{1}_n^T z$, and
$\|\cdot\|$ denotes the 2-norm, i.e., $\|z\| = \sqrt{z^H z}$, with $z^H$ the conjugate transpose of $z$. It becomes clear now why we excluded the shape vectors with
all equal entries, i.e., why we imposed $z \in \mathbb{C}^n_*$: this guarantees $z \neq \bar{z}$, thus $\|z - \bar{z}\| \neq 0$ in (20). Since
in the remainder of the paper we focus on comparing shapes affected by the above-mentioned geometric
transformations, we make $\phi$ supersede $a$, i.e., we refer to the analytic function $\phi(z, \cdot)$ as the analytic
signature (ANSIG) of the shape $z$, and to the composite map $\phi$ in (20) as the ANSIG map.

The reader may wonder about the reason for the factor $\sqrt{n}$ in (20). In fact, this factor has nothing to do with factoring
out translation or scale: it is constant for shapes described by the same number $n$ of landmarks.
The motivation for this factor is precisely to enable dealing with shapes described by different numbers
of landmarks, providing robustness to over/under sampling the shapes. To demonstrate this capability,
consider an extreme example, where each landmark in the $n$-dimensional vector $z \in \mathbb{C}^n_*$ is repeated $p$
times, leading to the $pn$-dimensional vector $z_p$, i.e.,

$$z_p = \mathbf{1}_p \otimes z, \tag{21}$$

where $\otimes$ is the Kronecker product. Naturally, $z$ and $z_p$ represent the same shape. We now show that the
factor $\sqrt{n}$ in (20) guarantees that they have the same ANSIG³.

To simplify the notation, name $w$ the post-processed version of the shape vector $z$, i.e.,

$$w = \sqrt{n}\, \frac{z - \bar{z}}{\|z - \bar{z}\|}. \tag{22}$$

From the relation between $z_p$ and $z$ in (21), it immediately follows that

$$z_p - \bar{z}_p = \mathbf{1}_p \otimes (z - \bar{z}) \quad \text{and} \quad \|z_p - \bar{z}_p\| = \sqrt{p}\, \|z - \bar{z}\|, \tag{23}$$

thus the post-processed version of $z_p$ is just the $p$-times replication of the one of $z$ in (22):

$$\sqrt{pn}\, \frac{z_p - \bar{z}_p}{\|z_p - \bar{z}_p\|} = \mathbf{1}_p \otimes w. \tag{24}$$

³ Although this extreme example does not happen in practice, it is useful to illustrate what may happen when comparing shapes that,
although similar, have been (very) differently sampled, leading to vectors of (very) different dimensions. This is frequent, for example, when
the shapes to compare come from the edge maps of images of very different resolutions.

The ANSIG of $z_p$ is now successively written as

$$\begin{aligned}
\phi(z_p, \xi) &= a(\mathbf{1}_p \otimes w, \xi) && (25) \\
&= \frac{1}{pn} \underbrace{\left( \sum_{m=1}^{n} e^{w_m \xi} + \sum_{m=1}^{n} e^{w_m \xi} + \cdots + \sum_{m=1}^{n} e^{w_m \xi} \right)}_{p \text{ times}} && (26) \\
&= \frac{1}{n} \sum_{m=1}^{n} e^{w_m \xi} && (27) \\
&= a(w, \xi) && (28) \\
&= \phi(z, \xi), && (29)
\end{aligned}$$

establishing the desired equality between the signatures of $z_p$ and $z$. Equalities (25) and (29) come from
the definition of $\phi$ in (20); in (26) and (28) we used the definition of $a$ in (8).
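
The properties derived in this subsection can be checked with a minimal Python sketch of the composite map in (20). The function name `ansig_phi`, the random test shape, and the particular scale, translation, and replication factor are hypothetical values chosen for illustration.

```python
import numpy as np

def ansig_phi(z, xi):
    """Evaluate phi(z, xi) of (20): center z, normalize it, scale by sqrt(n), then apply (8)."""
    z = np.asarray(z, dtype=complex)
    centered = z - z.mean()
    norm = np.linalg.norm(centered)          # 2-norm of a complex vector
    if norm == 0:
        raise ValueError("all landmarks coincide; such vectors are excluded from the domain")
    w = np.sqrt(z.size) * centered / norm
    xi = np.atleast_1d(np.asarray(xi, dtype=complex))
    return np.exp(xi[:, None] * w[None, :]).mean(axis=1)

rng = np.random.default_rng(2)
z = rng.normal(size=40) + 1j * rng.normal(size=40)   # hypothetical landmarks
xi = np.exp(2j * np.pi * np.arange(16) / 16)         # a few points on the unit-circle

reference = ansig_phi(z, xi)
moved = 3.7 * rng.permutation(z) + (2.0 - 1.5j)      # permutation, scale, and translation
replicated = np.tile(z, 5)                           # 1_p (Kronecker) z with p = 5
rotated = np.exp(0.7j) * z                           # rotation by 0.7 rad

print(np.allclose(reference, ansig_phi(moved, xi)))       # invariance to permutation, scale, translation
print(np.allclose(reference, ansig_phi(replicated, xi)))  # invariance to replication of the landmarks
print(np.allclose(reference, ansig_phi(rotated, xi)))     # rotation is not factored out by phi
```

Dropping the factor `np.sqrt(z.size)` from the sketch breaks the replication check while leaving the other two unchanged, which is a direct way to see what the factor $\sqrt{n}$ is for.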

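Maximal invariance with respect to permutation, translation, and scale. We now show that $\phi$ is a
maximal invariant with respect to all transformations, except rotation, i.e., that two shapes have the same
ANSIG if and only if they are equal up to permutation, translation, and scale.
The maximal invariance of $\phi$ can be formalized as

$$\phi(z, \cdot) = \phi(w, \cdot) \;\Leftrightarrow\; z = (\Pi, t, 1, \lambda) \cdot w, \tag{30}$$

for some $(\Pi, t, 1, \lambda) \in G$. The invariance, i.e., the sufficiency part ($\Leftarrow$) of (30), is straightforward:

$$\begin{aligned}
\phi(z, \cdot) &= \phi((\Pi, t, 1, \lambda) \cdot w, \cdot) && (31) \\
&= \phi(\lambda \Pi w + t \mathbf{1}_n, \cdot) && (32) \\
&= a\!\left( \sqrt{n}\, \frac{\lambda \Pi w + t \mathbf{1}_n - \overline{\lambda \Pi w + t \mathbf{1}_n}}{\left\| \lambda \Pi w + t \mathbf{1}_n - \overline{\lambda \Pi w + t \mathbf{1}_n} \right\|},\, \cdot \right) && (33) \\
&= a\!\left( \sqrt{n}\, \Pi \frac{w - \bar{w}}{\|w - \bar{w}\|},\, \cdot \right) && (34) \\
&= a\!\left( \sqrt{n}\, \frac{w - \bar{w}}{\|w - \bar{w}\|},\, \cdot \right) && (35) \\
&= \phi(w, \cdot), && (36)
\end{aligned}$$

where we just used the definition of $\phi$, standard manipulation of matrices and vectors in (34), and the
invariance of $a$ with respect to permutations in (35).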

We now prove the maximal invariance, i.e., the necessity part (=>) of the equivalence (l30l) . Naturally, the 
maximal invariance of cp also hinges on the maximal invariance of a established in the previous section. 
Suppose that 0(z, •) = (/>(w, •). Then, from the definition of (j) in (l20l) . 

^TT~ =ip ' J = a f V^TT~ =¥' ' J ' (37) 

z — z\\ J \ \\w — w\\ J 

and, by the maximal invariance of a with respect to permutations, equality (1371) implies that 

/— z—~z „ ^ w — w 

V™ n z-rr = u Vn T , zztt , (38) 

\\z — z\\ \\w — w\\ 

for some II G U(n). The relation between z and w in (1381) can be rewritten as 



(39) 
It is now clear that this relation is of the form $z = \lambda\Pi w + t\mathbf{1}_n$, establishing that $z = (\Pi, t, 1, \lambda)\cdot w$ for
a particular choice of $(\Pi, t, 1, \lambda) \in G$, thus completing the proof of (30): just define

$$\lambda = \frac{\|z-\bar z\|}{\|w-\bar w\|} > 0 \quad\text{and}\quad t = \frac{1}{n}\mathbf{1}_n^T (z - \lambda w) \in \mathbb{C}. \tag{40}$$
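
The choice in (40) can be verified on a small example. In the sketch below (the point set, the relabeling `perm`, and the helper `centered_norm` are hypothetical, introduced only for illustration), a shape is relabeled, scaled, and translated, and the scale and translation are then recovered without knowing the permutation, since both the centered norm and the sum of the entries are insensitive to relabeling.

```python
import math


def centered_norm(v):
    """Euclidean norm of the vector after removing its mean."""
    mean = sum(v) / len(v)
    return math.sqrt(sum(abs(x - mean) ** 2 for x in v))


w = [0j, 2 + 0j, 2 + 1j, 1j, 1 + 3j]        # arbitrary example shape
perm = [3, 0, 4, 1, 2]                       # an arbitrary relabeling
z = [2.5 * w[i] + (3 - 4j) for i in perm]    # z = lambda * Pi * w + t * 1_n

# Expression (40): ratio of centered norms, and mean of (z - lambda * w).
lam = centered_norm(z) / centered_norm(w)
t = sum(zi - lam * wi for zi, wi in zip(z, w)) / len(z)

print(lam, t)  # recovers the scale 2.5 and the translation 3 - 4j up to rounding
```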

C. Rotation 

We have seen that the ANSIG map $\phi$ quotients out all transformations in $G$, except the rotation. We
now show that although $\phi$ is not invariant to rotations, it is equivariant, i.e., that its action commutes with
the one of the group of rotations. Naturally, this equivariance is inherited from the map $a$ in (8). Let us
then start by looking at how the rotation of a shape $z$ affects the analytic function produced by the map
$a$. The rotated version of the shape is simply $e^{j\theta}z$. From the definition of $a$ in (8), it follows that

$$a(e^{j\theta}z,\xi) = \frac{1}{n}\sum_{k=1}^{n} e^{e^{j\theta}z_k\,\xi} = \frac{1}{n}\sum_{k=1}^{n} e^{z_k\,(e^{j\theta}\xi)} = a(z, e^{j\theta}\xi). \tag{41}$$

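Equivariance with respect to rotation. The property just derived (41) leads immediately to the
equivariance of the ANSIG map $\phi$, through the following sequence of equalities:

\begin{align}
\phi\big((\Pi,t,e^{j\theta},\lambda)\cdot z,\xi\big) &= \phi\big(\lambda e^{j\theta}\Pi z + t\mathbf{1}_n,\xi\big) \tag{42}\\
&= a\!\left(\sqrt{n}\,\Pi\,\frac{e^{j\theta}(z-\bar z)}{\|z-\bar z\|},\,\xi\right) \tag{43}\\
&= a\!\left(\sqrt{n}\,\frac{e^{j\theta}(z-\bar z)}{\|z-\bar z\|},\,\xi\right) \tag{44}\\
&= a\!\left(\sqrt{n}\,\frac{z-\bar z}{\|z-\bar z\|},\,e^{j\theta}\xi\right) \tag{45}\\
&= \phi(z, e^{j\theta}\xi), \tag{46}
\end{align}

where (43) and (46) use the definition of $\phi$ in (20) and simple manipulations, (44) uses the invariance of
$a$ with respect to permutations, and (45) comes from (41).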

We thus conclude that the rotation of a shape propagates to its signature, i.e., the ANSIG of the rotated
shape is simply a rotated version (in the complex plane) of the ANSIG of the original shape. Obviously,
the equivariance is inherited by the restriction of the ANSIG to the unit-circle, $\phi_{S^1} : \mathbb{C}^n \times S^1 \to \mathbb{C}$,
$\phi_{S^1}(z,\xi) = \phi(z,\xi)$:

$$\phi_{S^1}\big((\Pi, t, e^{j\theta}, \lambda)\cdot z,\,\xi\big) = \phi_{S^1}(z, e^{j\theta}\xi). \tag{47}$$
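
On a uniform sampling grid, the equivariance (47) shows up as a cyclic shift of the samples whenever the rotation angle is a multiple of the grid step. The sketch below illustrates this under assumptions made here: the helper `ansig`, a grid of `K = 32` samples, and a rotation of `k = 5` grid steps are all choices for this example.

```python
import cmath
import math


def ansig(points, K=64):
    """Return K unit-circle samples of the ANSIG of a list of complex landmarks."""
    n = len(points)
    mean = sum(points) / n
    centered = [p - mean for p in points]
    scale = math.sqrt(n) / math.sqrt(sum(abs(c) ** 2 for c in centered))
    w = [scale * c for c in centered]
    grid = [cmath.exp(2j * math.pi * k / K) for k in range(K)]
    return [sum(cmath.exp(wm * xi) for wm in w) / n for xi in grid]


K, k = 32, 5
theta = 2 * math.pi * k / K                      # rotation by k grid steps
z = [0j, 2 + 0j, 2 + 1j, 1j, 1 + 3j]
rotated = [cmath.exp(1j * theta) * p for p in z]

sig, sig_rot = ansig(z, K), ansig(rotated, K)
shifted = sig[k:] + sig[:k]                      # shifted[i] == sig[(i + k) % K]

max_diff = max(abs(u - v) for u, v in zip(sig_rot, shifted))
print(max_diff)  # should be at floating-point rounding level
```

For a rotation angle that does not fall on the grid, the relation only holds approximately, with an error that shrinks as `K` grows.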

Equality of orbits. In summary, two vectors are in the same orbit of the group $G$ of relevant
transformations, i.e., they represent the same shape, up to permutation, translation, rotation, and scale, if and only if (the
restriction to the unit-circle of) their ANSIGs are equal, up to a rotation:

$$z \text{ and } w \text{ represent the same shape} \;\Leftrightarrow\; \phi_{S^1}(z,\cdot) = \phi_{S^1}(w, e^{j\theta}\,\cdot), \text{ for some } \theta \in [0, 2\pi]. \tag{48}$$

The necessity part of this claim ($\Rightarrow$) comes immediately from (47). To prove the sufficiency ($\Leftarrow$),
suppose that $\phi_{S^1}(z,\cdot) = \phi_{S^1}(w, e^{j\theta}\,\cdot)$, for some $\theta \in [0, 2\pi]$. This means that

$$\phi(z, e^{j\varphi}) = \phi(w, e^{j\theta}e^{j\varphi}), \text{ for all } \varphi \in [0, 2\pi]. \tag{49}$$

Using the equivariance of $\phi$, derived in (42)-(46), in the right-hand side of (49), we get

$$\phi(z, e^{j\varphi}) = \phi(e^{j\theta}w, e^{j\varphi}), \text{ for all } \varphi \in [0, 2\pi], \tag{50}$$

stating that the analytic functions $\phi(z,\cdot)$ and $\phi(e^{j\theta}w,\cdot)$ coincide on the unit-circle, thus, on the entire
complex plane:

$$\phi(z,\cdot) = \phi(e^{j\theta}w,\cdot). \tag{51}$$

From the maximal invariance of the map $\phi$, stated in (30), equality (51) is equivalent to

$$z = (\Pi, t, 1, \lambda)\cdot e^{j\theta}w, \tag{52}$$

for some $(\Pi, t, 1, \lambda) \in G$. Using the composition of elements of $G$, expressed in (11), equality (52) can
be rewritten as $z = (\Pi, t, e^{j\theta}, \lambda)\cdot w$, which shows that $z$ and $w$ are in the same orbit, i.e., that they
represent the same shape.

D. Illustrations 

We now illustrate the behavior of the ANSIG representation with respect to the properties studied
in this section, i.e., dealing with shape-preserving geometric transformations and with shapes described
by point sets of different cardinality (i.e., with different sampling density).

Shape-preserving transformations. We use the binary images shown in Fig. 4, which, besides a point
permutation (not seen in the images), also differ by translation, rotation, and scale factors (easily perceived
due to the different position, orientation, and scale of the shapes).



Fig. 4. Two shapes (panels "Shape 1" and "Shape 2") that differ by a geometric transformation: the right one is obtained by translating, rotating, and resizing the left one.
The ordering of the points describing the shapes is also made distinct on purpose, as in the example of Fig. 2.

We computed the (unit-circle samples of the) ANSIGs of the shapes in Fig. 4, obtaining the magnitude
and phase plots in Fig. 5. Note that, in spite of the different vectors describing the shapes in Fig. 4, both
ANSIG plots in Fig. 5 only differ by a (circular) translation. This is in agreement with what we concluded
in this section, see expression (47): permutation, translation, and scale are factored out; the rotation of a
shape induces the same rotation on its ANSIG, thus a (circular) translation of the magnitude and phase
plots of its restriction to the unit-circle. The rotation that aligns the shapes can be efficiently computed,
as will be described in Section IV.




Fig. 5. Magnitude and phase of the ANSIGs of the two shapes of Fig. 4, plotted against $\theta$ (rad). Note that, although these shapes differ by position, rotation,
scale, and point labeling, their ANSIGs only differ by a (circular) translation that can be easily computed.

Different sampling density. The experiment above used shapes described by sets of points of the same
cardinality. We now illustrate that the ANSIG also copes with shapes described by point sets of (very)
different cardinality, as also discussed in this section. This characteristic is very important in practice,
e.g., to deal with shapes obtained from images of different resolutions (due to the pixelization, the same
shape usually leads to point sets of different cardinality).

We use the shapes represented in Fig. 6. Both represent the same object but the shape on the right is
described by just 30% of the points of the one on the left. In Fig. 7 we represent the ANSIGs of both
shapes. As easily seen, the ANSIGs of the differently sampled shapes almost coincide, as desired, and as
anticipated in this section, see the insight provided by expressions (25)-(29). All our tests confirmed that
the ANSIG representation deals well with different sampling densities, at least as long as the corresponding
shapes are recognizable by a human.



Fig. 6. Similar shapes (panels "Shape 1" and "Shape 2") with different sampling density. The right image has 30% of the points of the left one.




Fig. 7. Magnitude and phase of the ANSIGs of the shapes in Fig. 6, plotted against $\theta$ (rad). Although the numbers of points describing these shapes differ
significantly, their ANSIGs are similar, as desired.

We finally comment on the fact that the ANSIG plots in Fig. 3 of the previous section are periodic
(they exhibit not only the $2\pi$ periodicity of any ANSIG but also a smaller fundamental period of $\pi$).
This periodicity is due to the fact that the shape corresponding to those plots is invariant under rotations of
multiples of $\pi$ (see Fig. 2). Since shape rotation leads to a translation of the ANSIG plots, the plots for
the shape in Fig. 2 are invariant to translations of multiples of $\pi$, i.e., they are periodic with period $\pi$,
therefore two periods appear in the interval $[0, 2\pi]$. Naturally, this does not happen with shapes that do
not exhibit rotational symmetry, e.g., the ones in Figs. 4 and 6, whose ANSIG plots in Figs. 5 and 7
exhibit only $2\pi$ periodicity.
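
The periodicity argument can be reproduced on toy data. In the sketch below, the two point sets are hypothetical examples chosen here (they are not the shapes of the figures): `symmetric` is mapped onto itself by a rotation of $\pi$, `asymmetric` is not. With `K` samples on the unit-circle, a period of $\pi$ means that sample `i` equals sample `i + K/2`.

```python
import cmath
import math


def ansig(points, K=64):
    """Return K unit-circle samples of the ANSIG of a list of complex landmarks."""
    n = len(points)
    mean = sum(points) / n
    centered = [p - mean for p in points]
    scale = math.sqrt(n) / math.sqrt(sum(abs(c) ** 2 for c in centered))
    w = [scale * c for c in centered]
    grid = [cmath.exp(2j * math.pi * k / K) for k in range(K)]
    return [sum(cmath.exp(wm * xi) for wm in w) / n for xi in grid]


def half_period_gap(points, K=64):
    """Largest difference between samples that are half a turn apart."""
    sig = ansig(points, K)
    return max(abs(sig[i] - sig[(i + K // 2) % K]) for i in range(K))


symmetric = [2 + 1j, -2 - 1j, 2 - 1j, -2 + 1j, 3 + 0j, -3 + 0j]  # invariant to a rotation of pi
asymmetric = [0j, 2 + 0j, 2 + 1j, 1j, 1 + 3j]                    # no rotational symmetry

sym_gap = half_period_gap(symmetric)
asym_gap = half_period_gap(asymmetric)
print(sym_gap, asym_gap)  # first value at rounding level, second clearly nonzero
```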

IV. Shape-based recognition 

In this section we describe how the ANSIG representation can be used for rigid shape recognition. Since
our representation is invariant to (or deals gracefully with) the relevant transformations (point labeling,
translation, rotation, and scale), we are able to perform shape classification using a very simple scheme:
the shape database is just composed of the ANSIGs of prototype shapes (a single prototype per shape
class); classification boils down to comparing the ANSIG of a candidate shape with the ones in the
database.

Our experiments show that the invariance of the ANSIG representation enables dealing with noise and
distortions typical of shapes obtained from real images, with the simple strategy just outlined, thus avoiding
computationally complex learning schemes. We emphasize, however, that the ANSIG representation
can be incorporated in more sophisticated classification procedures if non-rigid shape classification is the
goal. For example, storing the ANSIGs of several prototypes per class and using k-nearest-neighbor (k-NN)
classification is straightforward.
A. Efficient comparison of ANSIGs

As described in Section II, a practical way to implement the map $a : \mathbb{C}^n \to \mathcal{A}$ is through its unit-circle
sampled version $a_K : \mathbb{C}^n \to \mathbb{C}^K$, introduced in (15). Thus, in practice, the unit-circle ANSIG restriction
$\phi_{S^1}$ is approximated by its sampled version $\phi_K : \mathbb{C}^n \to \mathbb{C}^K$, given by

$$\phi_K(z) = a_K\!\left(\sqrt{n}\,\frac{z-\bar z}{\|z-\bar z\|}\right) = \begin{bmatrix}\phi(z,1) & \phi(z,W_K) & \cdots & \phi(z,W_K^{K-1})\end{bmatrix}^T, \quad W_K = e^{j2\pi/K}, \tag{53}$$

where, we recall, $K$ denotes the number of samples on the unit-circle. Naturally, with $K$ sufficiently large,
any rotation $e^{j\theta}$ is well approximated by a point in the unit-circle sampling grid, i.e., $e^{j\theta} \simeq e^{j\frac{2\pi}{K}k} = W_K^k$,
thus the equivariance of $\phi_{S^1}$, expressed in (47), gracefully transfers to its discrete version $\phi_K$, as

$$\phi_K\big((\Pi, t, e^{j\theta}, \lambda)\cdot z\big) \simeq \phi_K(z)\ \mathrm{mod}\ k, \tag{54}$$

where $\mathrm{mod}\ k$ denotes a $k$-step cyclic shift.

Our test for deciding if two point vectors $z$ and $w$ correspond to the same shape, i.e., for the equality
of the orbits of $z$ and $w$, expressed in (48), leads to checking if $\phi_K(w)$ is a cyclic-shifted version of
$\phi_K(z)$:

$$\phi_K(z) = \phi_K(w)\ \mathrm{mod}\ k, \text{ for some } k = 0, 1, \ldots, K-1. \tag{55}$$

This test can be carried out by computing the cyclic-shift $k^*$ that best "aligns" the vectors,

$$k^* = \arg\min_{k=0,1,\ldots,K-1} \big\|\phi_K(z) - \phi_K(w)\ \mathrm{mod}\ k\big\|^2, \tag{56}$$

and then checking the similarity of the corresponding "aligned" versions.

Solving (56) by computing $\phi_K(w)\ \mathrm{mod}\ k$ and comparing it with $\phi_K(z)$, for each $k$, leads to an algorithm
with computational complexity $O(K^2)$. We now present a computationally simpler scheme, based on the
Fast Fourier Transform (FFT). We denote the Discrete Fourier Transform (DFT) of a vector $v \in \mathbb{C}^K$ by
the vector $\hat v \in \mathbb{C}^K$, given by

$$\hat v = D_K^H v, \quad\text{with}\quad d_k = \frac{1}{\sqrt{K}}\begin{bmatrix}1 & W_K^{k} & W_K^{2k} & \cdots & W_K^{(K-1)k}\end{bmatrix}^T. \tag{57}$$

In (57), $D_K$ is the $K \times K$ DFT matrix [40], $D_K^H$ denotes its conjugate transpose (also known as the
Hermitian), and $d_k$ is the $k$-th column of $D_K$.

Using Parseval's relation [40], which is based on the fact that the DFT is a unitary operator ($D_K^H D_K = I_K$),
the minimization (56) is written in the frequency domain as

$$k^* = \arg\min_{k=0,1,\ldots,K-1} \Big\|\hat\phi_K(z) - \widehat{\phi_K(w)\ \mathrm{mod}\ k}\Big\|^2. \tag{58}$$

Since the DFT of a $k$-cyclic-shifted version of a signal equals the DFT of the original signal multiplied
by the exponential sequence $\sqrt{K}d_k$ [40], expression (58) is equivalent to

$$k^* = \arg\min_{k=0,1,\ldots,K-1} \Big\|\hat\phi_K(z) - \hat\phi_K(w) \odot \sqrt{K}\,d_k\Big\|^2, \tag{59}$$



where $\odot$ denotes the elementwise (also known as Schur or Hadamard) product. Removing from the
norm in (59) the terms that do not depend on $k$, we finally get

$$k^* = \arg\max_{k=0,1,\ldots,K-1} \operatorname{Re}\Big\{ d_k^H \Big( \hat\phi_K(z) \odot \overline{\hat\phi_K(w)} \Big) \Big\}, \tag{60}$$

where $\overline{v}$ denotes the conjugate of $v$ and $\operatorname{Re}\{v\}$ its real part.

Expression (60) provides a computationally simple scheme to compute the shift $k^*$ that best "aligns" the
ANSIGs $\phi_K(z)$ and $\phi_K(w)$ of two shape vectors $z$ and $w$: i) compute the DFTs $\hat\phi_K(z)$ and $\hat\phi_K(w)$; ii)
compute the elementwise product $\hat\phi_K(z) \odot \overline{\hat\phi_K(w)}$; iii) compute the DFT of this product and locate the entry
with largest real part. Since each DFT is computed by using the FFT algorithm, which has computational
complexity $O(K \log K)$, the overall complexity of our comparison scheme is $O(K + 3K \log K)$.
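
Steps i) to iii) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the helper names (`ansig`, `fft`, `best_shift_fft`, `best_shift_brute`) are introduced here, the FFT is a textbook radix-2 Cooley-Tukey routine written with the standard library only (so `K` must be a power of two), and the unnormalized forward convention $e^{-j2\pi im/K}$ is assumed. The constant factors that distinguish this convention from the unitary DFT of (57) do not change the location of the maximum. The brute-force $O(K^2)$ search of (56) is included only as a cross-check.

```python
import cmath
import math


def ansig(points, K=64):
    """Return K unit-circle samples of the ANSIG of a list of complex landmarks."""
    n = len(points)
    mean = sum(points) / n
    centered = [p - mean for p in points]
    scale = math.sqrt(n) / math.sqrt(sum(abs(c) ** 2 for c in centered))
    w = [scale * c for c in centered]
    grid = [cmath.exp(2j * math.pi * k / K) for k in range(K)]
    return [sum(cmath.exp(wm * xi) for wm in w) / n for xi in grid]


def fft(x):
    """Radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    K = len(x)
    if K == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * K
    for m in range(K // 2):
        tw = cmath.exp(-2j * math.pi * m / K) * odd[m]
        out[m] = even[m] + tw
        out[m + K // 2] = even[m] - tw
    return out


def best_shift_fft(a, b):
    """Shift k maximizing Re sum_i a[i] * conj(b[(i + k) % K]), using three FFTs."""
    A, B = fft(a), fft(b)                                 # step i)
    product = [x * y.conjugate() for x, y in zip(A, B)]   # step ii)
    corr = fft(product)                                   # step iii)
    return max(range(len(a)), key=lambda k: corr[k].real)


def best_shift_brute(a, b):
    """Direct O(K^2) minimization of the squared distance over all cyclic shifts."""
    K = len(a)
    return min(range(K), key=lambda k: sum(abs(a[i] - b[(i + k) % K]) ** 2 for i in range(K)))


w = [0j, 2 + 0j, 2 + 1j, 1j, 1 + 3j]
theta = 2 * math.pi * 7 / 64                       # rotation by 7 grid steps
z = [1.7 * cmath.exp(1j * theta) * p + (2 - 1j) for p in reversed(w)]

a, b = ansig(z), ansig(w)
k_fft, k_brute = best_shift_fft(a, b), best_shift_brute(a, b)
print(k_fft, k_brute)  # both searches should agree on the same shift
```

In a numerical environment that provides an FFT routine, that routine would normally replace the hand-written `fft` above; only its sign and normalization conventions need to be matched.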

Finally, to measure the similarity between the shapes in $z$ and $w$, i.e., the similarity between their
"aligned" ANSIGs $\phi_K(z)$ and $\phi_K(w)\ \mathrm{mod}\ k^*$, we use the (cosine of the) angle between the corresponding
vectors:

$$\psi(z,w) = \frac{\Big|\phi_K(z)^H \big(\phi_K(w)\ \mathrm{mod}\ k^*\big)\Big|}{\|\phi_K(z)\|\,\|\phi_K(w)\|}. \tag{61}$$

Thus, the similarity $\psi(z,w)$ is such that $0 \le \psi(z,w) \le 1$ and the larger $\psi(z,w)$ is, the more similar
are the shapes in $z$ and $w$.
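
A possible use of this similarity for nearest-prototype classification is sketched below. Everything specific in it is an assumption made for illustration: the helper names, the toy prototypes built by `polygon` (corners plus edge midpoints of regular polygons), the brute-force alignment, and the use of the magnitude of the normalized inner product as the cosine.

```python
import cmath
import math


def ansig(points, K=64):
    """Return K unit-circle samples of the ANSIG of a list of complex landmarks."""
    n = len(points)
    mean = sum(points) / n
    centered = [p - mean for p in points]
    scale = math.sqrt(n) / math.sqrt(sum(abs(c) ** 2 for c in centered))
    w = [scale * c for c in centered]
    grid = [cmath.exp(2j * math.pi * k / K) for k in range(K)]
    return [sum(cmath.exp(wm * xi) for wm in w) / n for xi in grid]


def similarity(a, b):
    """Align b to a by the best cyclic shift, then return the normalized correlation magnitude."""
    K = len(a)

    def corr(k):
        return sum(a[i].conjugate() * b[(i + k) % K] for i in range(K))

    k_star = max(range(K), key=lambda k: corr(k).real)
    norms = math.sqrt(sum(abs(x) ** 2 for x in a) * sum(abs(x) ** 2 for x in b))
    return abs(corr(k_star)) / norms


def polygon(sides):
    """Corners and edge midpoints of a regular polygon (toy prototype)."""
    corners = [cmath.exp(2j * math.pi * m / sides) for m in range(sides)]
    mids = [(corners[m] + corners[(m + 1) % sides]) / 2 for m in range(sides)]
    return corners + mids


database = {"triangle": ansig(polygon(3)), "square": ansig(polygon(4))}

# Candidate: the square, relabeled, scaled, translated, and rotated by 9 grid steps.
theta = 2 * math.pi * 9 / 64
candidate = [0.4 * cmath.exp(1j * theta) * p + (5 + 2j) for p in reversed(polygon(4))]
candidate_sig = ansig(candidate)

scores = {name: similarity(candidate_sig, sig) for name, sig in database.items()}
best = max(scores, key=scores.get)
print(best, scores)
```

Because every signature has a unit constant term, the scores of different shapes can all be close to one; the decision relies on their ranking rather than on an absolute threshold.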



B. Improving robustness when dealing with interior shape detail 

Before presenting experiments with shape classification, we anticipate that when a single ANSIG is
used to describe a shape characterized by having specific details occurring at very different distances from
the geometric center, the representation may lack the desired robustness. This is illustrated by the example
in Figs. 8 and 9. Fig. 8 shows two shapes characterized by a similar large outer hexagon and distinct
small inner polygons. The ANSIGs of these two shapes are represented in Fig. 9. As easily perceived,
these signatures are very similar. Although one could argue that the shapes in Fig. 8 are in fact similar,
the argument would be misleading, since if the small polygons were not in the center but in the periphery
of the image, the ANSIGs would be far more distinct. We now briefly discuss this issue and ways to deal
with it.



Fig. 8. Two shapes (panels "Shape 1" and "Shape 2") that only differ in the points that are close to their center.

Fig. 9. ANSIGs of the two shapes in Fig. 8, plotted against $\theta$ (rad). The similarity of the exterior hexagons makes the signatures very similar.

Weighted ANSIG. The similarity of the ANSIGs in Fig. 9 is explained by the exponential weighting of
each landmark modulus (i.e., distance to the geometric center), when computing the map $a$ in (8), on which
the ANSIG is based. In fact, representing the elements of a shape vector $z$ by their polar coordinates,
i.e., $z_m = \rho_m e^{j\theta_m}$, the map $a$ in (8) is written as

$$a(z,\xi) = \frac{1}{n}\sum_{m=1}^{n} e^{\rho_m e^{j\theta_m}\xi} = \frac{1}{n}\sum_{m=1}^{n} \big(e^{\rho_m}\big)^{e^{j\theta_m}\xi}, \tag{62}$$

which shows that landmarks at large distances from the origin (i.e., with large $\rho_m$) are given much
more weight (exponential weighting) than those closer to the center (small $\rho_m$).

One approach to tackle this problem is precisely to weight the landmarks in a different way, with the
care of maintaining the maximal invariance property. For example, if the shapes are preprocessed by an
elementwise map $g$, that maps each entry $\rho_m e^{j\theta_m}$ to $\log(1+\rho_m)\,e^{j\theta_m}$, the signature obtained corresponds
to replacing the map $a$ in (62) by a map $a_g$, given by:

$$a_g(z,\xi) = a(g(z),\xi) = \frac{1}{n}\sum_{m=1}^{n} (1+\rho_m)^{e^{j\theta_m}\xi}, \tag{63}$$

which now takes into account the modulus of the landmarks, i.e., their distance to the origin, in a linear
way. It is straightforward to show that the maximal invariance of the ANSIG is not affected, because $g$
is an injection, i.e., there is a one-to-one mapping between $z$ and $g(z)$.
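
The identity in (63) can be checked directly. The sketch below is an illustration only: the function names `a_map`, `g`, and `a_weighted`, the example points, and the evaluation point `xi` are assumptions introduced here. It compares the map $a$ applied to the preprocessed landmarks with the closed form on the right-hand side of (63).

```python
import cmath
import math


def a_map(points, xi):
    """Map a of (8): average of exp(z_m * xi) over the landmarks."""
    return sum(cmath.exp(p * xi) for p in points) / len(points)


def g(p):
    """Elementwise preprocessing: rho * e^{j theta} -> log(1 + rho) * e^{j theta}."""
    rho, theta = cmath.polar(p)
    return cmath.rect(math.log1p(rho), theta)


def a_weighted(points, xi):
    """Right-hand side of (63): average of (1 + rho_m) ** (e^{j theta_m} * xi)."""
    total = 0j
    for p in points:
        rho, theta = cmath.polar(p)
        total += (1 + rho) ** (cmath.exp(1j * theta) * xi)
    return total / len(points)


z = [1.5 + 0.5j, -1 + 1j, -0.5 - 1.5j, 0.2 - 0.1j, -0.2 + 0.1j]  # example landmarks
xi = cmath.exp(0.7j)                                              # a point on the unit-circle

diff = abs(a_map([g(p) for p in z], xi) - a_weighted(z, xi))
print(diff)  # should be at floating-point rounding level
```

Since `g` keeps the angle of each landmark and only compresses its modulus, it commutes with rotations about the origin, which is consistent with the equivariance discussed earlier when the shape is centered before `g` is applied.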

In Fig. 10 we show the ANSIGs obtained for the shapes of Fig. 8, now using the differently weighted
map $a_g$ in (63). Comparing the plots of Figs. 9 and 10, we see that the shape signatures are in fact more
dissimilar, as desired, when the weighted version of the ANSIG is used.




Fig. 10. Magnitude and phase of the ANSIGs of the two shapes in Fig. 8, plotted against $\theta$ (rad), when the map $a$ in (8) is replaced by the weighted version $a_g$
in (63). As desired, the signatures are more distinct than the ones in Fig. 9.

Using more than one ANSIG. Rather than trying to describe shapes such as the ones in Fig. 8 with
a single signature, an approach that must be considered is to use multiple descriptions. In fact, if we
compute the ANSIGs of just the inner parts of these shapes, i.e., of the smaller inner polygons (the square
and the triangle), we obtain the plots in Fig. 11, which are very distinct. This motivates the use of more
than one ANSIG to distinguish between shapes like the ones in Fig. 8, i.e., shapes with similar outer
content.




Fig. 11. ANSIGs of the small square and triangle in the shapes of Fig. 8, plotted against $\theta$ (rad). Naturally, when this inner part is taken into account separately,
the signatures are very distinct.



Although several hypotheses could be considered, we describe a simple way to implement a shape
representation scheme based on two ANSIGs: the ANSIG of the entire set of landmarks, $\phi(z)$, and the
ANSIG of some subset of inner landmarks, $\phi(z_{in})$. Naturally, $z_{in}$ can be obtained from $z$ in several
ways, e.g., by selecting the subset of points that have normalized absolute value smaller than one. Note
that this joint signature keeps the invariance properties of the single ANSIG representation: the maximal
invariance is obvious, since we maintain the ANSIG of the complete shape; the rotational equivariance
comes from the fact that the partition is circular.

When comparing shapes $z$ and $w$ under this framework, the cyclic-shift $k^*$ that best "aligns"
simultaneously both pairs of ANSIG vectors, $\phi_K(z)$ with $\phi_K(w)$ and $\phi_K(z_{in})$ with $\phi_K(w_{in})$, is obtained, by
following steps similar to the ones in (56) to (60), as:



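\begin{align}
k^* &= \arg\min_{k}\ \big\|\phi_K(z) - \phi_K(w)\ \mathrm{mod}\ k\big\|^2 + \big\|\phi_K(z_{in}) - \phi_K(w_{in})\ \mathrm{mod}\ k\big\|^2 \tag{64}\\
&= \arg\max_{k}\ \operatorname{Re}\Big\{ d_k^H \Big( \hat\phi_K(z) \odot \overline{\hat\phi_K(w)} + \hat\phi_K(z_{in}) \odot \overline{\hat\phi_K(w_{in})} \Big) \Big\}. \tag{65}
\end{align}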



To measure the two-ANSIG similarity, we can use a weighted average of the similarities $\psi(z,w)$
and $\psi(z_{in}, w_{in})$, defined in (61). A natural option is to weight each partial similarity by the corresponding
number of shape points that are taken into account.
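
One way to put the two-ANSIG comparison together is sketched below. It is only an illustration under assumptions made here: the helper names, the brute-force joint search over shifts (an FFT-based search could be substituted), the inner-subset rule (normalized modulus below one), the use of the point counts of the first shape as weights, and the toy shape made of an outer ring and a small inner triangle.

```python
import cmath
import math


def normalize(points):
    """Center the landmarks and scale them so that the vector has norm sqrt(n)."""
    n = len(points)
    mean = sum(points) / n
    centered = [p - mean for p in points]
    scale = math.sqrt(n) / math.sqrt(sum(abs(c) ** 2 for c in centered))
    return [scale * c for c in centered]


def ansig(points, K=64):
    """Return K unit-circle samples of the ANSIG of a list of complex landmarks."""
    w = normalize(points)
    grid = [cmath.exp(2j * math.pi * k / K) for k in range(K)]
    return [sum(cmath.exp(wm * xi) for wm in w) / len(w) for xi in grid]


def inner_subset(points):
    """Landmarks whose normalized modulus is below one (one possible choice of z_in)."""
    return [p for p, w in zip(points, normalize(points)) if abs(w) < 1]


def correlation(a, b, k):
    K = len(a)
    return sum(a[i].conjugate() * b[(i + k) % K] for i in range(K))


def cosine(a, b, k):
    norms = math.sqrt(sum(abs(x) ** 2 for x in a) * sum(abs(x) ** 2 for x in b))
    return abs(correlation(a, b, k)) / norms


def two_ansig_similarity(z, w, K=64):
    z_in, w_in = inner_subset(z), inner_subset(w)  # each needs at least two distinct points
    a, b = ansig(z, K), ansig(w, K)
    a_in, b_in = ansig(z_in, K), ansig(w_in, K)

    def joint(k):
        # A single shift has to align both pairs of signatures.
        return (correlation(a, b, k) + correlation(a_in, b_in, k)).real

    k_star = max(range(K), key=joint)
    n, n_in = len(z), len(z_in)
    psi = (n * cosine(a, b, k_star) + n_in * cosine(a_in, b_in, k_star)) / (n + n_in)
    return k_star, psi


outer = [3 * cmath.exp(1j * math.pi * m / 3) for m in range(1, 6)] + [4 + 0j]
inner = [0.5 * cmath.exp(1j * (math.pi / 2 + 2 * math.pi * m / 3)) for m in range(3)]
w = outer + inner
theta = 2 * math.pi * 5 / 64                     # rotation by 5 grid steps
z = [0.8 * cmath.exp(1j * theta) * p + (-1 + 2j) for p in reversed(w)]

k_star, psi = two_ansig_similarity(z, w)
print(k_star, psi)
```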



V. Experiments 

We now present experiments that demonstrate the usefulness of the ANSIG in shape-based classification.
We start by evaluating the robustness to noise, using synthetic data. Then, we illustrate an application
to automatic trademark retrieval, using real images. Finally, we discuss the behavior of the ANSIG
representation in the presence of model violations.



A. Robustness to noise 

To evaluate the robustness of the ANSIG representation to the noise affecting the landmark positions,
we build a particularly challenging scenario with a database of four geometric shapes (a circle,
a hexagon, a square, and a triangle) that are difficult to distinguish in the presence of noise. In
Fig. 12 we show sample test shapes. They are noisy (and translated, rotated, scaled, and re-ordered)
versions of the four geometric shapes. Note how the circle and the hexagon become similar for high
levels of noise.

We then used the classification strategy described in the previous section: our database was composed
just of the ANSIGs of the four noiseless shapes and the ANSIG of each noisy test shape was classified
by selecting the most similar entry in the database. We tested the classifier by performing 2000 tests for
each of the four shapes, at each noise level, obtaining 100% correct classifications in all tests with
SNR above approximately 28 dB. Note that shapes with this level of noise are far from being visually "clean", see
the illustrations in the first column (the leftmost plots) of Fig. 12. The results are summarized in Table I,
where we see that a few classification errors happen for higher levels of noise, when dealing with circles
or hexagons, which, as noted before, for such levels of noise, are in fact difficult to distinguish, even for
humans, see the rightmost plots of the first two rows of Fig. 12.

Fig. 12. Sample test shapes. From top to bottom, the circle, the hexagon, the square, and the triangle, with varying noise level, translation,
rotation, and scale factors, and point labeling. Noise increases from left to right. Note that, for high levels of noise, the shapes become
difficult to distinguish; in particular, the circle and the hexagon become very similar.



Although not the focus of this paper, we now illustrate that the ANSIG representation can also be used
when dealing with non-rigid distortions, if more than one prototype per class is included in the database.
We use the subset of the MPEG-7 shape database [41], [42] shown in Fig. 13. It has 18 classes, each
containing 12 shapes that, although perceptually similar, are not geometrically equal. We increase shape
variability by creating test shapes that are noisy versions of the ones in Fig. 13 (the noise levels are
illustrated with an example in Fig. 14).




Fig. 13. The 216-shape subset [42] of the MPEG-7 shape database [41], containing 18 shape classes, each with 12 shapes.



We stored the ANSIGs of the noiseless shapes and then classified a set of 43200 test shapes (200 for
each sample shape in Fig. 13), for each of the two noise levels illustrated in Fig. 14, using a simple 1-NN
classifier, i.e., selecting the class with the most similar ANSIG. We got 100% correct classifications for both
noise levels. See [36] for more details on this experiment and [37] for another experiment that demonstrates
robustness to noise with a clip-art shape database.
Fig. 14. Illustration of the noise levels ($\sigma = 0.5$ and $\sigma = 1.5$) used in the MPEG-7 shape database illustrated in Fig. 13.



B. Real images 



We now describe an experiment with real images of trademarks. We obtained from the internet a set of
images of logos that are well described by their shape content. These images range from 80x80 to 500x500
pixels. To obtain shape vectors from intensity images, i.e., sets of points or landmarks that describe the
shape content, there is a panoply of low- and mid-level processing methods that can be used, ranging from
the simple image intensity threshold to sophisticated segmentation procedures. In this experiment, we just
processed the images with the Canny edge detector [43] and then stored the ANSIGs of the corresponding
edge maps in a database.

To test the performance of our shape recognition scheme with challenging test images, we printed the
trademark images and photographed their paper version with a low quality digital camera, with different
paper-camera positions, orientations, and distances. This way we obtained a set of 88 test images, where
the candidate logos appear at different sizes and positions, see some examples in Fig. 15.

Fig. 15. Examples of trademark images to be classified.

After running the Canny edge detector [43] on the test images, we performed the shape-based
classification in the same way as above, i.e., by just selecting the database entry that was most similar to each
test shape ANSIG. The results are summarized in Table II in the form of a confusion matrix. We see
that the vast majority of the test images were correctly classified. Exceptions are: one of the photographs of
the fifth logo (12.5%); and several photographs of the tenth one (50%, 37.5%). In the sequel, we discuss
these misclassifications.
C. Sensitivity to model violations

We now discuss how the behavior of the ANSIG representation is affected by model violations that 
may arise when dealing with shapes obtained from real images. In particular, we address two kinds of 
model violation: failures in edge detection and perspective distortions. 

Edge-detection failures. Many reasons may motivate gross failures in edge detection. For example, too
many edges may be detected, due to image noise or, conversely, too few edges may be detected, due
to the absence of abrupt enough changes in the image intensity. The examples in Fig. 16 illustrate these
situations. In the top left, there is one of the images in the trademark database. In the top middle and
right, two test images that are photographs of the same logo. The photograph in the middle was obtained
with a very large zoom (around 20x) and the right one was obtained in low light. Both lead to edge
maps that are very different from the one in the database, see the plots in the bottom row of Fig. 16. The
trademark images in Fig. 16 are precisely the ones that originated the misclassifications in the tenth row
of Table II. This is explained by the fact that our ANSIG representation, although invariant to permutation,
translation, rotation, and scale, and robust to sampling density, cannot cope, without further developments,
with severe degradations such as the ones in the edge maps of Fig. 16.

Edge detection failures may also happen when processing out-of-focus photographic images. Figs. 17
and 19 present two extreme cases. For the out-of-focus test image of Fig. 17, some edge points are missed,
but the corresponding ANSIG, represented in Fig. 18, is similar to the one in the database, leading
to a correct classification. This is due to the robustness of the ANSIG representation with respect
to shape sampling density, recall the derivation in Section III and its illustration in Figs. 6 and 7. In
contrast, the out-of-focus test image of Fig. 19 is very smooth, originating a highly incomplete
edge map (compare the middle plots of Fig. 19). This is the reason for the classification error in the fifth
row of Table II.

We emphasize, however, that different pre-processing schemes may easily improve the results. In fact,
one of the advantages of the shape representation scheme we propose in this paper is precisely the fact
that it does not require edge segments, dealing equally well with shapes described by arbitrary sets of
points. As an example, we extracted different shape vectors from the pair of images in Fig. 19, by using
a simple intensity threshold, see the resulting shapes in the middle plots of Fig. 20. Since thresholding
is much less sensitive to focusing effects, the shapes are more similar and the ANSIG successfully
captures this similarity, see Fig. 21.

Fig. 16. Failures in edge detection. Top: images; bottom: corresponding Canny edge maps [43]. Left: image in database; middle: photograph
with large zoom, for which the perceived texture of the paper originates spurious edge points; right: photograph at low light, for which
several edge points are not detected, due to less abrupt changes in image intensity.

Fig. 17. Recognition of an out-of-focus image. From left to right: database image, corresponding edge map, edge map of out-of-focus
image, out-of-focus image. See the ANSIGs of the middle plots in Fig. 18.

Perspective distortion. A different kind of model violation that may occur when dealing with 2D
shapes extracted from photographic images is due to perspective distortion. In fact, while the ANSIG
representation was developed assuming the shapes are 2D-rigid, this may not be the case, even when
only flat objects are considered, if the camera is not adequately oriented. To illustrate the behavior of the
ANSIG representation in the presence of perspective distortions, we photographed one of the trademark
images, purposely from directions not perpendicular to the paper. Some of the corresponding edge maps
are shown in Fig. 22, with perspective distortion increasing from left to right. The plots in Fig. 23 compare
the ANSIG of the leftmost (undistorted) shape of Fig. 22 with the ANSIGs of the perspectively distorted
ones. We can see that, although this effect was not taken into account in our modeling, the ANSIG
representation can deal with small perspective distortions (see the similarity of the thick solid, dashed,
and dot-dashed lines in the plots of Fig. 23). Naturally, when the distortions are severe, our representation
fails to adequately capture the shape similarity (see the thin solid lines in the plots of Fig. 23).

Fig. 18. ANSIGs of the shapes in Fig. 17, plotted against $\theta$ (rad). In spite of the unfocused image that led to the incomplete edge map, the ANSIGs are very
similar (recall that the circular shift is due to different image orientation and it is easily taken care of).

Fig. 19. Failure in recognizing an out-of-focus image, due to the highly incomplete edge map.

Fig. 20. Comparing the same pair of images of Fig. 19, now using intensity thresholding as the pre-processing step, leads to better results.
See also the ANSIGs of the middle plots in Fig. 21.

VI. Conclusion 

We proposed a new method to represent 2D shapes, described by a set of unlabeled points, or landmarks,
in the plane. Our method is based on what we call the analytic signature (ANSIG) of the shape, whose
most distinctive characteristic is its invariance to the way the landmarks are labeled. This makes the
ANSIG particularly suited to cope with shapes described by large sets of edge points in images. We
illustrated its performance in shape-based classification tasks.

Fig. 21. ANSIGs of the shapes in Fig. 20, plotted against $\theta$ (rad). In spite of the unfocused image, the ANSIGs are very similar (again, the circular shift is due
to different image orientation and it is easily taken care of).

We envisage paths for future research and development based on the ANSIG representation. In this 
paper, we store ANSIGs by sampling them on the unit-circle. A topic that deserves further study is not 
only the choice of sampling rate but also the adoption of different sampling schemes, e.g., the use of two or 
more concentric circles for robustness. The derivations in this paper are targeted to the representation and 
comparison of complete shapes. However, in many practical scenarios, it is also necessary to recognize 
a set of points as being a part, i.e., a subset of a given shape. A good challenge is then to adapt the 
ANSIG representation to deal with incomplete shapes. Finally, in our experiments, shapes are obtained 
directly from the (noisy) output of the edge detection process. Naturally, intermediate processing steps, 
e.g., the popular morphological filtering operations, would lead to "cleaner" shapes, thus to more accurate 
classifications. 